Overview

Dataset statistics

Number of variables: 20
Number of observations: 734932
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 122.7 MiB
Average record size in memory: 175.0 B
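
As a quick sanity check, the average record size can be recovered from the total memory figure and the row count. The sketch below assumes that MiB means 2**20 bytes and that the average is a plain total divided by the number of observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics in this report.
total_size_mib = 122.7      # total size in memory (already rounded by the report)
n_observations = 734932     # number of rows

BYTES_PER_MIB = 2 ** 20     # assumption: MiB is the binary unit

# Average bytes per row, assuming a plain total / count.
avg_record_bytes = total_size_mib * BYTES_PER_MIB / n_observations
print(f"{avg_record_bytes:.1f} B per record")
```

The result lands within rounding distance of the reported 175.0 B; the small gap is expected because 122.7 MiB is itself a rounded value.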

Variable types

Numeric: 9
DateTime: 1
Categorical: 9
Unsupported: 1

Alerts

day_of_week is highly overall correlated with open [High correlation]
customers is highly overall correlated with sales [High correlation]
promo2_since_year is highly overall correlated with promo2 [High correlation]
sales is highly overall correlated with customers and 1 other field [High correlation]
open is highly overall correlated with day_of_week and 1 other field [High correlation]
store_type is highly overall correlated with assortment [High correlation]
assortment is highly overall correlated with store_type [High correlation]
promo2 is highly overall correlated with promo2_since_year [High correlation]
state_holiday is highly imbalanced (88.2%) [Imbalance]
promo_interval is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
customers has 124977 (17.0%) zeros [Zeros]
sales has 124979 (17.0%) zeros [Zeros]
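
The 88.2% imbalance figure and the 17.0% zeros figure can both be reproduced from counts listed elsewhere in this report. The sketch below assumes the imbalance score is one minus the Shannon entropy of the class shares divided by the maximum possible entropy; that definition is an assumption here, although it does reproduce the reported value. The function name is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class shares."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # convention chosen for this sketch: a single class has nothing to compare
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(n_classes)

# Category counts for state_holiday ("0", "a", "b", "c") as listed in this report.
state_holiday_counts = [712422, 14691, 4844, 2975]
score = imbalance_score(state_holiday_counts)

# Share of zero-valued rows behind the customers "Zeros" alert.
zeros_pct = 100 * 124977 / 734932

print(f"imbalance: {score:.1%}, zeros: {zeros_pct:.1f}%")
```

A score near 1 means almost all rows fall in one class, which matches state_holiday, where the value 0 covers 96.9% of rows.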

Reproduction

Analysis started: 2023-06-12 18:58:59.203622
Analysis finished: 2023-06-12 18:59:43.698889
Duration: 44.5 seconds
Software version: pandas-profiling v0.0.dev0
Download configuration: config.json
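
The reported duration follows directly from the two timestamps. A minimal check using only the standard library, assuming both timestamps share the same clock and time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction section.
started = datetime.strptime("2023-06-12 18:58:59.203622", FMT)
finished = datetime.strptime("2023-06-12 18:59:43.698889", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.1f} seconds")
```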

Variables

store
Real number (ℝ)

Distinct: 1115
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 558.51995
Minimum: 1
Maximum: 1115
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 56
Q1: 281
Median: 558
Q3: 837
95-th percentile: 1060
Maximum: 1115
Range: 1114
Interquartile range (IQR): 556

Descriptive statistics

Standard deviation: 321.87006
Coefficient of variation (CV): 0.57629107
Kurtosis: -1.199737
Mean: 558.51995
Median Absolute Deviation (MAD): 278
Skewness: -0.0014385114
Sum: 4.1047418 × 10^8
Variance: 103600.34
Monotonicity: Not monotonic
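
Several of the store statistics are derived from one another, which gives a way to confirm how the fused figures should be read (for example, that Q1 is 281 and Q3 is 837). The sketch below recomputes the derived values from the reported ones, assuming the usual definitions: CV as standard deviation over mean, variance as the squared standard deviation, and sum as mean times row count. The variable names are illustrative.

```python
# Values copied from the store statistics in this report.
mean = 558.51995
std = 321.87006
q1, q3 = 281, 837
minimum, maximum = 1, 1115
n_observations = 734932

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance as the square of the standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n_observations    # sum of all values

print(cv, variance, iqr, value_range, total)
```

Because the inputs are rounded, the recomputed values agree with the report only up to the last printed digits.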

Histogram with fixed size bins (bins=50)

Common values
Value                Count   Frequency (%)
885                  727     0.1%
116                  720     0.1%
381                  719     0.1%
881                  717     0.1%
102                  716     0.1%
544                  715     0.1%
757                  713     0.1%
246                  712     0.1%
511                  711     0.1%
847                  711     0.1%
Other values (1105)  727771  99.0%

Smallest values
Value  Count  Frequency (%)
1      676    0.1%
2      681    0.1%
3      689    0.1%
4      683    0.1%
5      682    0.1%
6      685    0.1%
7      670    0.1%
8      675    0.1%
9      668    0.1%
10     680    0.1%

Largest values
Value  Count  Frequency (%)
1115   674    0.1%
1114   686    0.1%
1113   687    0.1%
1112   690    0.1%
1111   668    0.1%
1110   670    0.1%
1109   564    0.1%
1108   684    0.1%
1107   568    0.1%
1106   674    0.1%

day_of_week
Real number (ℝ)

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.9977495
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.9977421
Coefficient of variation (CV): 0.4997167
Kurtosis: -1.2470392
Mean: 3.9977495
Median Absolute Deviation (MAD): 2
Skewness: 0.0017088695
Sum: 2938074
Variance: 3.9909737
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values
Value  Count   Frequency (%)
5      105361  14.3%
4      105338  14.3%
3      105268  14.3%
2      105114  14.3%
1      104757  14.3%
7      104564  14.2%
6      104530  14.2%

Smallest values
Value  Count   Frequency (%)
1      104757  14.3%
2      105114  14.3%
3      105268  14.3%
4      105338  14.3%
5      105361  14.3%
6      104530  14.2%
7      104564  14.2%

Largest values
Value  Count   Frequency (%)
7      104564  14.2%
6      104530  14.2%
5      105361  14.3%
4      105338  14.3%
3      105268  14.3%
2      105114  14.3%
1      104757  14.3%

date
Date

Distinct: 942
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.2 MiB
Minimum: 2013-01-01 00:00:00
Maximum: 2015-07-31 00:00:00
Histogram with fixed size bins (bins=50)

customers
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 3962
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 632.96202
Minimum: 0
Maximum: 7388
Zeros: 124977
Zeros (%): 17.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 404
Median: 609
Q3: 837
95-th percentile: 1361
Maximum: 7388
Range: 7388
Interquartile range (IQR): 433

Descriptive statistics

Standard deviation: 464.21296
Coefficient of variation (CV): 0.7333978
Kurtosis: 7.1147692
Mean: 632.96202
Median Absolute Deviation (MAD): 216
Skewness: 1.5982741
Sum: 4.6518404 × 10^8
Variance: 215493.67
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value                Count   Frequency (%)
0                    124977  17.0%
560                  1796    0.2%
576                  1727    0.2%
533                  1699    0.2%
555                  1694    0.2%
651                  1692    0.2%
566                  1687    0.2%
614                  1684    0.2%
603                  1683    0.2%
517                  1670    0.2%
Other values (3952)  594623  80.9%

Smallest values
Value  Count   Frequency (%)
0      124977  17.0%
3      1       < 0.1%
5      1       < 0.1%
8      1       < 0.1%
13     1       < 0.1%
36     1       < 0.1%
40     1       < 0.1%
50     1       < 0.1%
61     1       < 0.1%
64     1       < 0.1%

Largest values
Value  Count  Frequency (%)
7388   1      < 0.1%
5494   1      < 0.1%
5387   1      < 0.1%
5297   1      < 0.1%
5192   1      < 0.1%
5152   1      < 0.1%
5112   1      < 0.1%
5106   1      < 0.1%
5090   1      < 0.1%
5069   1      < 0.1%

open
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.2 MiB

Value  Count
1      609995
0      124937

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 734932
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value  Count   Frequency (%)
1      609995  83.0%
0      124937  17.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
1      609995  83.0%
0      124937  17.0%

Most occurring characters

Value  Count   Frequency (%)
1      609995  83.0%
0      124937  17.0%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  734932  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      609995  83.0%
0      124937  17.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  734932  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
1      609995  83.0%
0      124937  17.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  734932  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
1      609995  83.0%
0      124937  17.0%

promo
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.2 MiB

Value  Count
0      454226
1      280706

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 734932
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      454226  61.8%
1      280706  38.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      454226  61.8%
1      280706  38.2%

Most occurring characters

Value  Count   Frequency (%)
0      454226  61.8%
1      280706  38.2%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  734932  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      454226  61.8%
1      280706  38.2%

Most occurring scripts

Value   Count   Frequency (%)
Common  734932  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      454226  61.8%
1      280706  38.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  734932  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      454226  61.8%
1      280706  38.2%

state_holiday
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.2 MiB

Value  Count
0      712422
a      14691
b      4844
c      2975

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 734932
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      712422  96.9%
a      14691   2.0%
b      4844    0.7%
c      2975    0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      712422  96.9%
a      14691   2.0%
b      4844    0.7%
c      2975    0.4%

Most occurring characters

Value  Count   Frequency (%)
0      712422  96.9%
a      14691   2.0%
b      4844    0.7%
c      2975    0.4%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    712422  96.9%
Lowercase Letter  22510   3.1%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
a      14691  65.3%
b      4844   21.5%
c      2975   13.2%

Decimal Number
Value  Count   Frequency (%)
0      712422  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  712422  96.9%
Latin   22510   3.1%

Most frequent character per script

Latin
Value  Count  Frequency (%)
a      14691  65.3%
b      4844   21.5%
c      2975   13.2%

Common
Value  Count   Frequency (%)
0      712422  100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  734932  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      712422  96.9%
a      14691   2.0%
b      4844    0.7%
c      2975    0.4%

school_holiday
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.2 MiB

Value  Count
0      603648
1      131284

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 734932
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count   Frequency (%)
0      603648  82.1%
1      131284  17.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      603648  82.1%
1      131284  17.9%

Most occurring characters

Value  Count   Frequency (%)
0      603648  82.1%
1      131284  17.9%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  734932  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      603648  82.1%
1      131284  17.9%

Most occurring scripts

Value   Count   Frequency (%)
Common  734932  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      603648  82.1%
1      131284  17.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  734932  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      603648  82.1%
1      131284  17.9%

store_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.2 MiB

Value  Count
a      398944
d      225963
c      98696
b      11329

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 734932
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a
2nd row: a
3rd row: a
4th row: a
5th row: c

Common Values

Value  Count   Frequency (%)
a      398944  54.3%
d      225963  30.7%
c      98696   13.4%
b      11329   1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
a      398944  54.3%
d      225963  30.7%
c      98696   13.4%
b      11329   1.5%

Most occurring characters

Value  Count   Frequency (%)
a      398944  54.3%
d      225963  30.7%
c      98696   13.4%
b      11329   1.5%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  734932  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
a      398944  54.3%
d      225963  30.7%
c      98696   13.4%
b      11329   1.5%

Most occurring scripts

Value  Count   Frequency (%)
Latin  734932  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
a      398944  54.3%
d      225963  30.7%
c      98696   13.4%
b      11329   1.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  734932  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
a      398944  54.3%
d      225963  30.7%
c      98696   13.4%
b      11329   1.5%

assortment
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.2 MiB

Value  Count
a      388938
c      340027
b      5967

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 734932
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: c
2nd row: c
3rd row: c
4th row: a
5th row: a

Common Values

Value  Count   Frequency (%)
a      388938  52.9%
c      340027  46.3%
b      5967    0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
a      388938  52.9%
c      340027  46.3%
b      5967    0.8%

Most occurring characters

Value  Count   Frequency (%)
a      388938  52.9%
c      340027  46.3%
b      5967    0.8%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  734932  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
a      388938  52.9%
c      340027  46.3%
b      5967    0.8%

Most occurring scripts

Value  Count   Frequency (%)
Latin  734932  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
a      388938  52.9%
c      340027  46.3%
b      5967    0.8%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  734932  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
a      388938  52.9%
c      340027  46.3%
b      5967    0.8%

competition_distance
Real number (ℝ)

Distinct: 655
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5928.7621
Minimum: 20
Maximum: 200000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 MiB

Quantile statistics

Minimum: 20
5-th percentile: 130
Q1: 710
Median: 2330
Q3: 6900
95-th percentile: 20620
Maximum: 200000
Range: 199980
Interquartile range (IQR): 6190

Descriptive statistics

Standard deviation: 12551.454
Coefficient of variation (CV): 2.1170447
Kurtosis: 147.86516
Mean: 5928.7621
Median Absolute Deviation (MAD): 1980
Skewness: 10.248645
Sum: 4.357237 × 10^9
Variance: 1.5753901 × 10^8
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value               Count   Frequency (%)
250                 8017    1.1%
350                 5433    0.7%
50                  5427    0.7%
190                 5396    0.7%
1200                5308    0.7%
90                  4771    0.6%
180                 4759    0.6%
330                 4615    0.6%
150                 4537    0.6%
2640                4108    0.6%
Other values (645)  682561  92.9%

Smallest values
Value  Count  Frequency (%)
20     675    0.1%
30     2746   0.4%
40     3463   0.5%
50     5427   0.7%
60     2051   0.3%
70     3276   0.4%
80     2044   0.3%
90     4771   0.6%
100    3405   0.5%
110    3906   0.5%

Largest values
Value   Count  Frequency (%)
200000  1912   0.3%
75860   674    0.1%
58260   679    0.1%
48330   677    0.1%
46590   677    0.1%
45740   661    0.1%
44320   695    0.1%
40860   697    0.1%
40540   662    0.1%
38710   701    0.1%

[variable name missing in source]
Real number (ℝ)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.7827976
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
Median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.3105199
Coefficient of variation (CV): 0.48807588
Kurtosis: -1.2326458
Mean: 6.7827976
Median Absolute Deviation (MAD): 3
Skewness: -0.039781499
Sum: 4984895
Variance: 10.959542
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)

Common values
Value             Count  Frequency (%)
9                 96573  13.1%
4                 85963  11.7%
11                74996  10.2%
3                 69994  9.5%
7                 65563  8.9%
12                56394  7.7%
6                 55917  7.6%
10                54732  7.4%
5                 52526  7.1%
2                 48863  6.6%
Other values (2)  73411  10.0%

Smallest values
Value  Count  Frequency (%)
1      32741  4.5%
2      48863  6.6%
3      69994  9.5%
4      85963  11.7%
5      52526  7.1%
6      55917  7.6%
7      65563  8.9%
8      40670  5.5%
9      96573  13.1%
10     54732  7.4%

Largest values
Value  Count  Frequency (%)
12     56394  7.7%
11     74996  10.2%
10     54732  7.4%
9      96573  13.1%
8      40670  5.5%
7      65563  8.9%
6      55917  7.6%
5      52526  7.1%
4      85963  11.7%
3      69994  9.5%

[variable name missing in source]
Real number (ℝ)

Distinct: 23
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.3233
Minimum: 1900
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.4 MiB

Quantile statistics

Minimum: 1900
5-th percentile: 2002
Q1: 2008
Median: 2012
Q3: 2014
95-th percentile: 2015
Maximum: 2015
Range: 115
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 5.536153
Coefficient of variation (CV): 0.0027538621
Kurtosis: 125.26688
Mean: 2010.3233
Median Absolute Deviation (MAD): 2
Skewness: -7.3090883
Sum: 1.4774509 × 10^9
Variance: 30.64899
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=23)

Common values
Value              Count   Frequency (%)
2013               147998  20.1%
2014               132033  18.0%
2015               79378   10.8%
2012               53849   7.3%
2005               40765   5.5%
2010               37027   5.0%
2009               35809   4.9%
2011               35669   4.9%
2008               35010   4.8%
2007               31502   4.3%
Other values (13)  105892  14.4%

Smallest values
Value  Count  Frequency (%)
1900   562    0.1%
1961   681    0.1%
1990   3390   0.5%
1994   1384   0.2%
1995   1238   0.2%
1998   674    0.1%
1999   5275   0.7%
2000   6659   0.9%
2001   10728  1.5%
2002   17838  2.4%

Largest values
Value  Count   Frequency (%)
2015   79378   10.8%
2014   132033  18.0%
2013   147998  20.1%
2012   53849   7.3%
2011   35669   4.9%
2010   37027   5.0%
2009   35809   4.9%
2008   35010   4.8%
2007   31502   4.3%
2006   30966   4.2%

promo2
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.2 MiB

Value  Count
1      368064
0      366868

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 734932
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value  Count   Frequency (%)
1      368064  50.1%
0      366868  49.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
1      368064  50.1%
0      366868  49.9%

Most occurring characters

Value  Count   Frequency (%)
1      368064  50.1%
0      366868  49.9%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  734932  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      368064  50.1%
0      366868  49.9%

Most occurring scripts

Value   Count   Frequency (%)
Common  734932  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
1      368064  50.1%
0      366868  49.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  734932  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
1      368064  50.1%
0      366868  49.9%

promo2_since_week
Real number (ℝ)

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.612393
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 12
Median: 22
Q3: 37
95-th percentile: 47
Maximum: 52
Range: 51
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 14.307916
Coefficient of variation (CV): 0.60594943
Kurtosis: -1.1844802
Mean: 23.612393
Median Absolute Deviation (MAD): 12
Skewness: 0.17935973
Sum: 17353503
Variance: 204.71646
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count   Frequency (%)
14                 61114   8.3%
40                 50652   6.9%
10                 36560   5.0%
31                 36167   4.9%
5                  34059   4.6%
1                  31226   4.2%
13                 29873   4.1%
37                 29102   4.0%
22                 28940   3.9%
18                 27986   3.8%
Other values (42)  369253  50.2%

Smallest values
Value  Count  Frequency (%)
1      31226  4.2%
2      8275   1.1%
3      8221   1.1%
4      8242   1.1%
5      34059  4.6%
6      8955   1.2%
7      8191   1.1%
8      8303   1.1%
9      17191  2.3%
10     36560  5.0%

Largest values
Value  Count  Frequency (%)
52     5319   0.7%
51     5345   0.7%
50     6041   0.8%
49     5902   0.8%
48     11417  1.6%
47     5399   0.7%
46     5400   0.7%
45     26597  3.6%
44     7265   1.0%
43     5368   0.7%

promo2_since_year
Real number (ℝ)

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2012.7932
Minimum: 2009
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.4 MiB

Quantile statistics

Minimum: 2009
5-th percentile: 2009
Q1: 2012
Median: 2013
Q3: 2014
95-th percentile: 2015
Maximum: 2015
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.6634744
Coefficient of variation (CV): 0.00082645073
Kurtosis: -0.21233728
Mean: 2012.7932
Median Absolute Deviation (MAD): 1
Skewness: -0.7841154
Sum: 1.4792661 × 10^9
Variance: 2.767147
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values
Value  Count   Frequency (%)
2013   223052  30.4%
2014   198092  27.0%
2015   89969   12.2%
2011   83174   11.3%
2012   52739   7.2%
2009   47220   6.4%
2010   40686   5.5%

Smallest values
Value  Count   Frequency (%)
2009   47220   6.4%
2010   40686   5.5%
2011   83174   11.3%
2012   52739   7.2%
2013   223052  30.4%
2014   198092  27.0%
2015   89969   12.2%

Largest values
Value  Count   Frequency (%)
2015   89969   12.2%
2014   198092  27.0%
2013   223052  30.4%
2012   52739   7.2%
2011   83174   11.3%
2010   40686   5.5%
2009   47220   6.4%

promo_interval
Unsupported

Alerts: REJECTED, UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 11.2 MiB

sales
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 20629
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5770.9286
Minimum: 0
Maximum: 41551
Zeros: 124979
Zeros (%): 17.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3725
Median: 5743
Q3: 7857
95-th percentile: 12120
Maximum: 41551
Range: 41551
Interquartile range (IQR): 4132

Descriptive statistics

Standard deviation: 3847.5133
Coefficient of variation (CV): 0.66670611
Kurtosis: 1.7789504
Mean: 5770.9286
Median Absolute Deviation (MAD): 2068
Skewness: 0.63968359
Sum: 4.2412401 × 10^9
Variance: 14803359
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
0                     124979  17.0%
5674                  157     < 0.1%
5931                  147     < 0.1%
5449                  145     < 0.1%
5489                  145     < 0.1%
6615                  144     < 0.1%
5041                  142     < 0.1%
5989                  141     < 0.1%
5558                  140     < 0.1%
5096                  139     < 0.1%
Other values (20619)  608653  82.8%

Smallest values
Value  Count   Frequency (%)
0      124979  17.0%
46     1       < 0.1%
124    1       < 0.1%
286    1       < 0.1%
297    1       < 0.1%
416    1       < 0.1%
520    1       < 0.1%
530    1       < 0.1%
538    1       < 0.1%
541    1       < 0.1%

Largest values
Value  Count  Frequency (%)
41551  1      < 0.1%
38367  1      < 0.1%
38037  1      < 0.1%
38025  1      < 0.1%
37403  1      < 0.1%
37376  1      < 0.1%
36227  1      < 0.1%
35909  1      < 0.1%
35702  1      < 0.1%
35154  1      < 0.1%

month_map
Categorical

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.2 MiB

Value             Count
Mar               75250
May               74942
Jan               74892
Jun               72526
Apr               72500
Other values (7)  364822

Length

Max length: 4
Median length: 3
Mean length: 3.0604138
Min length: 3
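
The mean length is consistent with the total character count reported for this variable. A minimal check, assuming the mean is total characters divided by the number of rows; since every label is either 3 or 4 characters long (the reported minimum and maximum), the surplus over 3 characters per row also gives the number of 4-character labels. The variable names are illustrative.

```python
# Figures copied from the month_map section and the dataset statistics.
total_characters = 2249196
n_observations = 734932

# Mean label length, assuming total characters / number of rows.
mean_length = total_characters / n_observations

# Each row has 3 or 4 characters, so the surplus counts the 4-character rows.
four_char_rows = total_characters - 3 * n_observations

print(mean_length, four_char_rows)
```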

Characters and Unicode

Total characters: 2249196
Distinct characters: 21
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Jan
2nd row: Fev
3rd row: Jan
4th row: Jun
5th row: Apr

Common Values

Value             Count  Frequency (%)
Mar               75250  10.2%
May               74942  10.2%
Jan               74892  10.2%
Jun               72526  9.9%
Apr               72500  9.9%
Jul               70862  9.6%
Fev               67641  9.2%
Aug               46025  6.3%
Oct               45789  6.2%
Dec               45676  6.2%
Other values (2)  88829  12.1%

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
mar               75250  10.2%
may               74942  10.2%
jan               74892  10.2%
jun               72526  9.9%
apr               72500  9.9%
jul               70862  9.6%
fev               67641  9.2%
aug               46025  6.3%
oct               45789  6.2%
dec               45676  6.2%
Other values (2)  88829  12.1%

Most occurring characters

Value              Count   Frequency (%)
a                  225084  10.0%
J                  218280  9.7%
u                  189413  8.4%
e                  157717  7.0%
M                  150192  6.7%
r                  147750  6.6%
n                  147418  6.6%
A                  118525  5.3%
p                  116900  5.2%
v                  112070  5.0%
Other values (11)  665847  29.6%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  1514264  67.3%
Uppercase Letter  734932   32.7%

Most frequent character per category

Lowercase Letter
Value             Count   Frequency (%)
a                 225084  14.9%
u                 189413  12.5%
e                 157717  10.4%
r                 147750  9.8%
n                 147418  9.7%
p                 116900  7.7%
v                 112070  7.4%
c                 91465   6.0%
t                 90189   6.0%
y                 74942   4.9%
Other values (3)  161316  10.7%

Uppercase Letter
Value  Count   Frequency (%)
J      218280  29.7%
M      150192  20.4%
A      118525  16.1%
F      67641   9.2%
O      45789   6.2%
D      45676   6.2%
N      44429   6.0%
S      44400   6.0%

Most occurring scripts

Value  Count    Frequency (%)
Latin  2249196  100.0%

Most frequent character per script

Latin
Value              Count   Frequency (%)
a                  225084  10.0%
J                  218280  9.7%
u                  189413  8.4%
e                  157717  7.0%
M                  150192  6.7%
r                  147750  6.6%
n                  147418  6.6%
A                  118525  5.3%
p                  116900  5.2%
v                  112070  5.0%
Other values (11)  665847  29.6%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2249196  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
a                  225084  10.0%
J                  218280  9.7%
u                  189413  8.4%
e                  157717  7.0%
M                  150192  6.7%
r                  147750  6.6%
n                  147418  6.6%
A                  118525  5.3%
p                  116900  5.2%
v                  112070  5.0%
Other values (11)  665847  29.6%

is_promo
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.2 MiB

Value  Count
0      616567
1      118365

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 734932
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      616567  83.9%
1      118365  16.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      616567  83.9%
1      118365  16.1%

Most occurring characters

Value  Count   Frequency (%)
0      616567  83.9%
1      118365  16.1%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  734932  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      616567  83.9%
1      118365  16.1%

Most occurring scripts

Value   Count   Frequency (%)
Common  734932  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      616567  83.9%
1      118365  16.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  734932  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      616567  83.9%
1      118365  16.1%

Interactions

[Interaction plots: images not reproduced in this extract]
2023-06-12T19:59:28.743416image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:30.424522image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:32.083764image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:33.766623image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:35.608809image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:37.304608image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:38.959333image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:40.785083image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:27.314749image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:28.924283image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:30.609244image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:32.273929image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:33.949999image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:35.805490image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:37.481128image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-06-12T19:59:39.137076image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/

Correlations

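| | store | day_of_week | customers | competition_distance | competition_open_since_month | competition_open_since_year | promo2_since_week | promo2_since_year | sales | open | promo | state_holiday | school_holiday | store_type | assortment | promo2 | month_map | is_promo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| store | 1.000 | 0.000 | 0.021 | -0.046 | -0.033 | 0.004 | 0.005 | 0.008 | 0.000 | 0.006 | 0.000 | 0.000 | 0.002 | 0.095 | 0.111 | 0.072 | 0.000 | 0.035 |
| day_of_week | 0.000 | 1.000 | -0.432 | 0.001 | -0.002 | 0.000 | -0.003 | 0.001 | -0.451 | 0.875 | 0.496 | 0.120 | 0.264 | 0.000 | 0.000 | 0.000 | 0.031 | 0.011 |
| customers | 0.021 | -0.432 | 1.000 | -0.180 | 0.003 | 0.015 | 0.032 | 0.121 | 0.903 | 0.327 | 0.279 | 0.071 | 0.047 | 0.318 | 0.273 | 0.181 | 0.024 | 0.081 |
| competition_distance | -0.046 | 0.001 | -0.180 | 1.000 | -0.024 | 0.009 | -0.014 | 0.030 | -0.027 | 0.013 | 0.000 | 0.000 | 0.002 | 0.045 | 0.062 | 0.159 | 0.000 | 0.070 |
| competition_open_since_month | -0.033 | -0.002 | 0.003 | -0.024 | 1.000 | -0.233 | 0.108 | 0.017 | -0.002 | 0.024 | 0.013 | 0.061 | 0.132 | 0.068 | 0.058 | 0.132 | 0.326 | 0.084 |
| competition_open_since_year | 0.004 | 0.000 | 0.015 | 0.009 | -0.233 | 1.000 | 0.018 | 0.030 | 0.029 | 0.000 | 0.000 | 0.002 | 0.000 | 0.055 | 0.078 | 0.054 | 0.001 | 0.029 |
| promo2_since_week | 0.005 | -0.003 | 0.032 | -0.014 | 0.108 | 0.018 | 1.000 | -0.121 | 0.056 | 0.033 | 0.051 | 0.102 | 0.182 | 0.072 | 0.091 | 0.283 | 0.432 | 0.160 |
| promo2_since_year | 0.008 | 0.001 | 0.121 | 0.030 | 0.017 | 0.030 | -0.121 | 1.000 | 0.062 | 0.005 | 0.015 | 0.016 | 0.026 | 0.084 | 0.109 | 0.682 | 0.099 | 0.304 |
| sales | 0.000 | -0.451 | 0.903 | -0.027 | -0.002 | 0.029 | 0.056 | 0.062 | 1.000 | 0.705 | 0.453 | 0.155 | 0.083 | 0.107 | 0.081 | 0.104 | 0.044 | 0.051 |
| open | 0.006 | 0.875 | 0.327 | 0.013 | 0.024 | 0.000 | 0.033 | 0.005 | 0.705 | 1.000 | 0.295 | 0.379 | 0.087 | 0.051 | 0.039 | 0.008 | 0.074 | 0.000 |
| promo | 0.000 | 0.496 | 0.279 | 0.000 | 0.013 | 0.000 | 0.051 | 0.015 | 0.453 | 0.295 | 1.000 | 0.054 | 0.067 | 0.000 | 0.000 | 0.000 | 0.052 | 0.005 |
| state_holiday | 0.000 | 0.120 | 0.071 | 0.000 | 0.061 | 0.002 | 0.102 | 0.016 | 0.155 | 0.379 | 0.054 | 1.000 | 0.213 | 0.002 | 0.001 | 0.011 | 0.220 | 0.027 |
| school_holiday | 0.002 | 0.264 | 0.047 | 0.002 | 0.132 | 0.000 | 0.182 | 0.026 | 0.083 | 0.087 | 0.067 | 0.213 | 1.000 | 0.003 | 0.002 | 0.007 | 0.385 | 0.026 |
| store_type | 0.095 | 0.000 | 0.318 | 0.045 | 0.068 | 0.055 | 0.072 | 0.084 | 0.107 | 0.051 | 0.000 | 0.002 | 0.003 | 1.000 | 0.539 | 0.105 | 0.007 | 0.045 |
| assortment | 0.111 | 0.000 | 0.273 | 0.062 | 0.058 | 0.078 | 0.091 | 0.109 | 0.081 | 0.039 | 0.000 | 0.001 | 0.002 | 0.539 | 1.000 | 0.014 | 0.005 | 0.008 |
| promo2 | 0.072 | 0.000 | 0.181 | 0.159 | 0.132 | 0.054 | 0.283 | 0.682 | 0.104 | 0.008 | 0.000 | 0.011 | 0.007 | 0.105 | 0.014 | 1.000 | 0.029 | 0.437 |
| month_map | 0.000 | 0.031 | 0.024 | 0.000 | 0.326 | 0.001 | 0.432 | 0.099 | 0.044 | 0.074 | 0.052 | 0.220 | 0.385 | 0.007 | 0.005 | 0.029 | 1.000 | 0.274 |
| is_promo | 0.035 | 0.011 | 0.081 | 0.070 | 0.084 | 0.029 | 0.160 | 0.304 | 0.051 | 0.000 | 0.005 | 0.027 | 0.026 | 0.045 | 0.008 | 0.437 | 0.274 | 1.000 |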
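
To read the matrix above more quickly, one option is to list only the off-diagonal pairs whose absolute coefficient exceeds a cutoff. The sketch below uses a five-variable excerpt copied from the table. The function name `high_correlation_pairs` and the 0.5 cutoff are assumptions made for illustration, not settings taken from the report configuration.

```python
# Illustrative sketch: list strongly correlated variable pairs from a
# symmetric correlation matrix. The 0.5 cutoff is an assumed value.

def high_correlation_pairs(names, matrix, threshold=0.5):
    """Return (a, b, value) for each pair above the diagonal with |value| >= threshold."""
    pairs = []
    for i, a in enumerate(names):
        for j in range(i + 1, len(names)):
            value = matrix[i][j]
            if abs(value) >= threshold:
                pairs.append((a, names[j], value))
    # Strongest relationships first.
    return sorted(pairs, key=lambda p: -abs(p[2]))

# Excerpt of the correlation table (rows and columns in the same order).
names = ["day_of_week", "customers", "sales", "open", "promo"]
matrix = [
    [1.000, -0.432, -0.451, 0.875, 0.496],
    [-0.432, 1.000, 0.903, 0.327, 0.279],
    [-0.451, 0.903, 1.000, 0.705, 0.453],
    [0.875, 0.327, 0.705, 1.000, 0.295],
    [0.496, 0.279, 0.453, 0.295, 1.000],
]

pairs = high_correlation_pairs(names, matrix)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

On this excerpt the pairs returned are customers and sales, day_of_week and open, and sales and open, which matches the high-correlation alerts listed for those variables in the overview.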

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

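| | store | day_of_week | date | customers | open | promo | state_holiday | school_holiday | store_type | assortment | competition_distance | competition_open_since_month | competition_open_since_year | promo2 | promo2_since_week | promo2_since_year | promo_interval | sales | month_map | is_promo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 582092 | 848 | 7 | 2014-01-26 | 0 | 0 | 0 | 0 | 0 | a | c | 370.000 | 7 | 2007 | 1 | 14 | 2011 | Jan,Apr,Jul,Oct | 0 | Jan | 1 |
| 184305 | 331 | 1 | 2015-02-16 | 803 | 1 | 1 | 0 | 0 | a | c | 670.000 | 2 | 2015 | 1 | 14 | 2015 | Jan,Apr,Jul,Oct | 7039 | Fev | 0 |
| 223725 | 726 | 1 | 2015-01-12 | 1437 | 1 | 1 | 0 | 0 | a | c | 40540.000 | 2 | 2002 | 0 | 3 | 2015 | 0 | 15611 | Jan | 0 |
| 60746 | 537 | 7 | 2015-06-07 | 0 | 0 | 0 | 0 | 0 | a | a | 600.000 | 5 | 2002 | 1 | 1 | 2012 | Jan,Apr,Jul,Oct | 0 | Jun | 0 |
| 485468 | 114 | 2 | 2014-04-22 | 550 | 1 | 0 | 0 | 1 | c | a | 4510.000 | 4 | 2014 | 1 | 48 | 2011 | Mar,Jun,Sept,Dec | 4260 | Apr | 0 |
| 209171 | 667 | 7 | 2015-01-25 | 0 | 0 | 0 | 0 | 0 | d | c | 2870.000 | 9 | 2012 | 0 | 4 | 2015 | 0 | 0 | Jan | 0 |
| 992815 | 136 | 2 | 2013-01-22 | 424 | 1 | 1 | 0 | 0 | a | c | 2200.000 | 12 | 2010 | 1 | 22 | 2012 | Feb,May,Aug,Nov | 5360 | Jan | 0 |
| 758907 | 378 | 2 | 2013-08-20 | 764 | 1 | 0 | 0 | 1 | a | c | 2140.000 | 8 | 2012 | 0 | 34 | 2013 | 0 | 4536 | Aug | 0 |
| 561223 | 49 | 4 | 2014-02-13 | 464 | 1 | 0 | 0 | 0 | d | c | 18010.000 | 9 | 2007 | 0 | 7 | 2014 | 0 | 5822 | Fev | 0 |
| 33999 | 550 | 3 | 2015-07-01 | 402 | 1 | 1 | 0 | 0 | d | c | 50.000 | 6 | 2015 | 0 | 27 | 2015 | 0 | 5605 | Jul | 0 |

| | store | day_of_week | date | customers | open | promo | state_holiday | school_holiday | store_type | assortment | competition_distance | competition_open_since_month | competition_open_since_year | promo2 | promo2_since_week | promo2_since_year | promo_interval | sales | month_map | is_promo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 669466 | 137 | 5 | 2013-11-08 | 1088 | 1 | 1 | 0 | 0 | a | a | 1730.000 | 7 | 2015 | 1 | 40 | 2014 | Jan,Apr,Jul,Oct | 9252 | Nov | 0 |
| 995705 | 796 | 7 | 2013-01-20 | 0 | 0 | 0 | 0 | 0 | a | c | 7180.000 | 11 | 2012 | 0 | 3 | 2013 | 0 | 0 | Jan | 0 |
| 356367 | 369 | 1 | 2014-08-25 | 513 | 1 | 0 | 0 | 1 | d | c | 5870.000 | 4 | 2014 | 0 | 35 | 2014 | 0 | 5325 | Aug | 0 |
| 940624 | 350 | 7 | 2013-03-10 | 0 | 0 | 0 | 0 | 0 | d | a | 8880.000 | 3 | 2013 | 1 | 14 | 2011 | Jan,Apr,Jul,Oct | 0 | Mar | 0 |
| 68154 | 140 | 7 | 2015-05-31 | 0 | 0 | 0 | 0 | 0 | a | c | 1090.000 | 7 | 2010 | 1 | 1 | 2013 | Jan,Apr,Jul,Oct | 0 | May | 0 |
| 23715 | 301 | 5 | 2015-07-10 | 608 | 1 | 0 | 0 | 0 | a | c | 4510.000 | 3 | 2015 | 0 | 28 | 2015 | 0 | 5588 | Jul | 0 |
| 615742 | 1048 | 5 | 2013-12-27 | 562 | 1 | 0 | 0 | 1 | d | c | 1860.000 | 9 | 2012 | 1 | 40 | 2012 | Jan,Apr,Jul,Oct | 6416 | Dec | 0 |
| 783713 | 654 | 1 | 2013-07-29 | 950 | 1 | 1 | 0 | 1 | c | a | 6930.000 | 9 | 2006 | 0 | 31 | 2013 | 0 | 10257 | Jul | 0 |
| 982497 | 968 | 5 | 2013-02-01 | 955 | 1 | 0 | 0 | 0 | c | a | 1190.000 | 2 | 2013 | 0 | 5 | 2013 | 0 | 7285 | Fev | 0 |
| 533203 | 1019 | 2 | 2014-03-11 | 555 | 1 | 0 | 0 | 0 | d | c | 2740.000 | 7 | 2014 | 1 | 13 | 2010 | Jan,Apr,Jul,Oct | 6006 | Mar | 0 |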
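
The sample rows suggest that is_promo may be derived by checking whether the month_map label appears in promo_interval. The sketch below tests that reading against a few of the sampled rows. The rule itself, the function name `derive_is_promo`, and the treatment of promo_interval as the string "0" when no interval applies are all assumptions; the report does not state how the column was built.

```python
# Hypothetical reconstruction of is_promo, checked against sampled rows.
# Assumption: promo_interval is either "0" or a comma-separated month list.

def derive_is_promo(promo_interval: str, month_map: str) -> int:
    """Return 1 if month_map exactly matches a label in promo_interval, else 0."""
    if promo_interval == "0":
        return 0
    return int(month_map in promo_interval.split(","))

# (promo_interval, month_map, is_promo) copied from the sample tables.
sample_rows = [
    ("Jan,Apr,Jul,Oct", "Jan", 1),
    ("Jan,Apr,Jul,Oct", "Fev", 0),
    ("0", "Jan", 0),
    ("Mar,Jun,Sept,Dec", "Apr", 0),
    ("Feb,May,Aug,Nov", "Jan", 0),
    ("Jan,Apr,Jul,Oct", "Nov", 0),
]

mismatches = [row for row in sample_rows if derive_is_promo(row[0], row[1]) != row[2]]
print(mismatches)
```

Under an exact-match rule like this, a month_map label spelled differently from the labels in promo_interval (the sample shows `Fev` in month_map and `Feb` in promo_interval) would always yield 0. The sampled rows do not settle whether that affects any record, so it may be worth checking on the full data.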